54 research outputs found

    A methodology for the generation of efficient error detection mechanisms

    A dependable software system must contain error detection mechanisms and error recovery mechanisms. Software components for the detection of errors are typically designed based on a system specification or the experience of software engineers, with their efficiency typically being measured using fault injection and metrics such as coverage and latency. In this paper, we introduce a methodology for the design of highly efficient error detection mechanisms. The proposed methodology combines fault injection analysis and data mining techniques in order to generate predicates for efficient error detection mechanisms. The results presented demonstrate the viability of the methodology as an approach for the development of efficient error detection mechanisms, as the predicates generated yield a true positive rate of almost 100% and a false positive rate very close to 0% for the detection of failure-inducing states. The main advantage of the proposed methodology over current state-of-the-art approaches is that efficient detectors are obtained by design, rather than by using specification-based detector design or the experience of software engineers
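A detector predicate of the kind described above can be scored by its true positive and false positive rates over labelled program states. The following is a minimal sketch only: the state variable `buffer_len`, the threshold, and the labelled samples are hypothetical stand-ins for fault injection logs and are not taken from the paper.

```python
from typing import Callable, Dict, Iterable, Tuple

State = Dict[str, float]


def detection_rates(
    predicate: Callable[[State], bool],
    samples: Iterable[Tuple[State, bool]],
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) of a detector predicate."""
    tp = fp = fn = tn = 0
    for state, failure_inducing in samples:
        flagged = predicate(state)
        if failure_inducing and flagged:
            tp += 1
        elif failure_inducing:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes so the sketch never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labelled states: (state, is_failure_inducing).
samples = [
    ({"buffer_len": 12.0}, True),
    ({"buffer_len": 9.0}, True),
    ({"buffer_len": 7.0}, True),
    ({"buffer_len": 10.0}, False),
    ({"buffer_len": 3.0}, False),
    ({"buffer_len": 1.0}, False),
]


def buffer_overrun_suspected(state: State) -> bool:
    """Hypothetical predicate; a mined predicate would take its place."""
    return state["buffer_len"] > 8


tpr, fpr = detection_rates(buffer_overrun_suspected, samples)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

In the methodology summarised above, the predicate would come from data mining over fault injection results rather than being written by hand as it is here.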

    Data mining for vehicle telemetry

    This paper presents a data mining methodology for driving condition monitoring via CAN-bus data that is based on the general data mining process. The approach is applicable to many driving condition problems and the example of road type classification without the use of location information is investigated. Location information from Global Positioning Satellites and related map data are often not available (for business reasons), or cannot represent the full dynamics of road conditions. In this work, Controller Area Network (CAN)-bus signals are used instead as inputs to models produced by machine learning algorithms. Road type classification is formulated as two related labelling problems: Road Type (A, B, C and Motorway) and Carriageway Type (Single or Dual). An investigation is presented into preprocessing steps required prior to applying machine learning algorithms, namely, signal selection, feature extraction, and feature selection. The selection methods used include Principal Components Analysis (PCA) and Mutual Information (MI), which are used to determine the relevance and redundancy of extracted features, and are performed in various combinations. Finally, as there is an inherent bias towards certain road and carriageway labellings, the issue of class imbalance in classification is explained and investigated. A system is produced, which is demonstrated to successfully ascertain road type from CAN-bus data, and it is shown that the classification correlates well with input signals such as vehicle speed, steering wheel angle, and suspension height

    Data mining for vehicle telemetry

    This article presents a data mining methodology for driving-condition monitoring via CAN-bus data that is based on the general data mining process. The approach is applicable to many driving condition problems, and the example of road type classification without the use of location information is investigated. Location information from Global Positioning Satellites and related map data are often not available (for business reasons), or cannot represent the full dynamics of road conditions. In this work, Controller Area Network (CAN)-bus signals are used instead as inputs to models produced by machine learning algorithms. Road type classification is formulated as two related labeling problems: Road Type (A, B, C, and Motorway) and Carriageway Type (Single or Dual). An investigation is presented into preprocessing steps required prior to applying machine learning algorithms, that is, signal selection, feature extraction, and feature selection. The selection methods used include principal components analysis (PCA) and mutual information (MI), which are used to determine the relevance and redundancy of extracted features and are performed in various combinations. Finally, because there is an inherent bias toward certain road and carriageway labelings, the issue of class imbalance in classification is explained and investigated. A system is produced, which is demonstrated to successfully ascertain road type from CAN-bus data, and it is shown that the classification correlates well with input signals such as vehicle speed, steering wheel angle, and suspension height
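The use of mutual information to judge feature relevance, as mentioned in the abstract above, can be illustrated with a small sketch. It assumes features that have already been discretised into bins; the bin names and labels below are hypothetical, and the authors' actual estimator and preprocessing are not specified here.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Mutual information in bits between two discrete sequences of equal length."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        joint = count / n
        # joint / (p(x) * p(y)) simplifies to count * n / (count_x * count_y)
        mi += joint * math.log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical discretised speed feature against a road type label.
speed_bin = ["low", "low", "high", "high"]
road_type = ["A", "A", "Motorway", "Motorway"]
relevant = mutual_information(speed_bin, road_type)

# A hypothetical feature that carries no information about the label.
noise_bin = ["x", "y", "x", "y"]
irrelevant = mutual_information(noise_bin, road_type)

print(relevant, irrelevant)
```

The same function applied to a pair of features, rather than a feature and a label, gives a simple redundancy measure: a high value suggests that one of the two features adds little.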

    Automated taxonomy generation for summarizing multi-type relational datasets

    Taxonomy construction provides an efficient navigating and browsing mechanism to people by organizing large amounts of information into a small number of hierarchical clusters. Compared with manually editing taxonomies, Automated Taxonomy Generation has numerous advantages and has therefore been applied to categorize document collections. However, the utility of this technique to organize and represent relational datasets has not been investigated, because of its unaffordable computational complexity. In this paper we propose a new ATG method based on the relational clustering framework DIVA. By incorporating the idea of Representative Objects, the computational complexity can be greatly reduced. Moreover, we analyze the divergence of the data attributes and label the taxonomic nodes accordingly. The quality of the derived taxonomy is quantitatively evaluated by a synthesized criterion that considers both the intra-node homogeneity and inter-node heterogeneity. Theoretical analysis and experimental results prove that our approach is comparably effective and more efficient than other ATG algorithms

    Multi-type relational clustering approaches : current state-of-the-art and new directions

    The proliferation of multi-type relational datasets in a number of important real-world applications and the limitations resulting from the transformation of such datasets to fit propositional data mining approaches have led to the emergence of the discipline of multi-type relational data mining. Clustering is an important unsupervised learning task aimed at discovering structure inherent in data. In this paper, we survey the state-of-the-art in the field of relational clustering, providing a taxonomy of approaches and reviewing some of the most representative algorithms within each category. We also present DIVA, our general framework for multi-type relational clustering, which combines the use of Representative Objects with multi-phase clustering in a bid to provide flexibility, efficiency and effectiveness in clustering relational datasets. Theoretical analysis and experimental results prove that our approach is more effective and efficient than a number of other algorithms proposed in the literature

    Retinal fundus image contrast normalization using mixture of Gaussians

    We present a fast and robust method to correct contrast variation in retinal fundus imagery. The technique uses a mixture of Gaussians to model the bias of the intensity variation. Typically a three or four component mixture is sufficient to characterize the principal variation due to the spherical geometry of the retina, the high-contrast reflection off the optic nerve and the darker macula. We compare the results with a non-parametric, filtering approach on a standard diabetic retinopathy database of 89 images. Our results indicate that a parametric approach using a mixture of Gaussians is better at contrast stretching in lesion regions, making it an effective pre-processing step for manual and computer aided diagnostic techniques
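The idea of modelling intensity bias as a mixture of Gaussians can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the components are isotropic, unnormalised Gaussian bumps with hand-picked parameters, fitting is not shown, and subtracting the bias from the image is assumed here as one plausible way to apply the correction.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical component layout: (weight, centre_x, centre_y, sigma).
Component = Tuple[float, float, float, float]


def bias_at(x: float, y: float, components: Sequence[Component]) -> float:
    """Evaluate the mixture-of-Gaussians bias surface at pixel (x, y)."""
    return sum(
        w * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * s * s))
        for w, cx, cy, s in components
    )


def correct(image: List[List[float]], components: Sequence[Component]) -> List[List[float]]:
    """Subtract the modelled bias from every pixel (assumed correction step)."""
    return [
        [pixel - bias_at(x, y, components) for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


# One hypothetical component centred on the middle pixel of a 3x3 image.
components = [(0.5, 1.0, 1.0, 1.0)]

# Synthetic image: uniform intensity 0.2 plus the modelled bias.
image = [[0.2 + bias_at(x, y, components) for x in range(3)] for y in range(3)]

corrected = correct(image, components)
print(corrected[1][1])
```

The abstract reports that three or four components are typically sufficient; in this sketch that would mean extending `components` with further entries whose parameters come from a fitting procedure.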

    Context and customer behaviour in recommendation

    The last few years have seen an increased interest in incorporating context within recommender systems. However, little empirical evidence has emerged to support the premise that context can actually improve recommendation accuracy. Indeed little agreement exists as to what represents the context of a user or indeed how such context should be used within a recommendation strategy. In this paper we study the effect of incorporating contextual variables, both observable and derived from past user behavior, on the accuracy of a content based recommender system. The system was evaluated using data from an Italian online retailer. Results suggest a significant improvement in performance when using contextual variables